Author: "Xu, Kai" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Xu, Kai"' showing total 11,869 results

1. Deep Tiny Network for Recognition-Oriented Face Image Quality Assessment

Author: Peng, Baoyun, primary, Liu, Min, additional, Zhang, Zhaoning, additional, Xu, Kai, additional, and Li, Dongsheng, additional
Published: 2024

2. Design of Digital Image Information Security Encryption Method Based on Deep Learning

Author: Sha, Licheng, primary, Duan, Peng, additional, Zhao, Xinchen, additional, Xu, Kai, additional, and Xi, Shaoqing, additional
Published: 2024

3. Base-Editor-Mediated circRNA Knockout by Targeting Predominantly Back-Splice Sites

Author: Ma, Xu-Kai, primary, Gao, Xiang, additional, Cao, Mei, additional, and Yang, Li, additional
Published: 2024

4. Dynamic Multi-modal Prompting for Efficient Visual Grounding

Author: Wu, Wansen, primary, Liu, Ting, additional, Wang, Youkai, additional, Xu, Kai, additional, Yin, Quanjun, additional, and Hu, Yue, additional
Published: 2023

5. Part-aware Shape Generation with Latent 3D Diffusion of Neural Voxel Fields

Author: Huang, Yuhang, Zou, Shilong, Liu, Xinwang, and Xu, Kai
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper presents a novel latent 3D diffusion model for the generation of neural voxel fields, aiming to achieve accurate part-aware structures. Compared to existing methods, there are two key designs to ensure high-quality and accurate part-aware generation. On one hand, we introduce a latent 3D diffusion process for neural voxel fields, enabling generation at significantly higher resolutions that can accurately capture rich textural and geometric details. On the other hand, a part-aware shape decoder is introduced to integrate the part codes into the neural voxel fields, guiding the accurate part decomposition and producing high-quality rendering results. Through extensive experimentation and comparisons with state-of-the-art methods, we evaluate our approach across four different classes of data. The results demonstrate the superior generative capabilities of our proposed method in part-aware shape generation, outperforming existing state-of-the-art methods.
Published: 2024

6. On-demand shaped photon emission based on a parametrically modulated qubit

Author: Li, Xiang, Li, Sheng-Yong, Zhao, Si-Lu, Mei, Zheng-Yang, He, Yang, Deng, Cheng-Lin, Liu, Yu, Liu, Yan-Jun, Liang, Gui-Han, Wang, Jin-Zhe, Song, Xiao-Hui, Xu, Kai, Fan, Heng, Zhang, Yu-Xiang, Xiang, Zhong-Cheng, and Zheng, Dong-Ning
Subjects: Quantum Physics
Abstract: In circuit quantum electrodynamics architectures, realizing a long-range quantum network mediated by flying photons requires shaping the temporal profile of the emitted photons to achieve high transfer efficiency between two quantum nodes. In this work, we demonstrate a new single-rail and dual-rail time-bin shaped photon generator without additional flux-tunable elements, which can act as a quantum interface of a point-to-point quantum network. In our approach, we adopt a qubit-resonator-transmission line configuration, and the effective coupling strength between the qubit and the resonator can be varied by parametrically modulating the qubit frequency. In this way, the coupling is directly proportional to the parametric modulation amplitude and covers a broad tunable range beyond 20 MHz for the sample we used. Additionally, when emitting shaped photons, we find that the spurious frequency shift (-0.4 MHz) due to parametric modulation is small and can be readily calibrated through chirping. We develop an efficient photon field measurement setup based on GPU data stream processing. Utilizing this system, we perform photon temporal profile measurement, quantum state tomography of the photon field, and quantum process tomography of single-rail quantum state transfer based on a heterodyne measurement scheme. The single-rail encoding state transfer fidelity is 90.32% for shaped photon emission and 97.20% for unshaped photon emission. We believe that the fidelity of shaped photon emission is mainly limited by the qubit coherence time. The results demonstrate that our method is hardware efficient, simple to implement, and scalable. It could become a viable tool in a high-quality quantum network utilizing both single-rail and dual-rail time-bin encoding.
Published: 2024

7. NeRF-Guided Unsupervised Learning of RGB-D Registration

Author: Yu, Zhinan, Qin, Zheng, Tang, Yijie, Wang, Yongjun, Yi, Renjiao, Zhu, Chenyang, and Xu, Kai
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: This paper focuses on training a robust RGB-D registration model without ground-truth pose supervision. Existing methods usually adopt a pairwise training strategy based on differentiable rendering, which enforces the photometric and the geometric consistency between the two registered frames as supervision. However, this frame-to-frame framework suffers from poor multi-view consistency due to factors such as lighting changes, geometry occlusion and reflective materials. In this paper, we present NeRF-UR, a novel frame-to-model optimization framework for unsupervised RGB-D registration. Instead of frame-to-frame consistency, we leverage the neural radiance field (NeRF) as a global model of the scene and use the consistency between the input and the NeRF-rerendered frames for pose optimization. This design can significantly improve the robustness in scenarios with poor multi-view consistency and provides better learning signal for the registration model. Furthermore, to bootstrap the NeRF optimization, we create a synthetic dataset, Sim-RGBD, through a photo-realistic simulator to warm up the registration model. By first training the registration model on Sim-RGBD and later unsupervisedly fine-tuning on real data, our framework enables distilling the capability of feature extraction and registration from simulation to reality. Our method outperforms the state-of-the-art counterparts on two popular indoor RGB-D datasets, ScanNet and 3DMatch. Code and models will be released for paper reproduction.
Published: 2024

8. Tunable coupling of a quantum phononic resonator to a transmon qubit with flip-chip architecture

Author: Ruan, Xinhui, Li, Li, Liang, Guihan, Zhao, Silu, Wang, Jia-heng, Bu, Yizhou, Chen, Bingjie, Song, Xiaohui, Li, Xiang, Zhang, He, Wang, Jinzhe, Zhao, Qianchuan, Xu, Kai, Fan, Heng, Liu, Yu-xi, Zhang, Jing, Peng, Zhihui, Xiang, Zhongcheng, and Zheng, Dongning
Subjects: Quantum Physics
Abstract: A hybrid system with tunable coupling between phonons and qubits shows great potential for advancing quantum information processing. In this work, we demonstrate strong and tunable coupling between a surface acoustic wave (SAW) resonator and a transmon qubit based on galvanic-contact flip-chip technique. The coupling strength varies from $2\pi\times 7.0$ MHz to $-2\pi\times 20.6$ MHz, which is extracted from different vacuum Rabi oscillation frequencies. The phonon-induced ac Stark shift of the qubit at different coupling strengths is also shown. Our approach offers a good experimental platform for exploring quantum acoustics and hybrid systems.
Published: 2024

9. Learning Cross-hand Policies for High-DOF Reaching and Grasping

Author: She, Qijin, Zhang, Shishun, Ye, Yunfan, Liu, Min, Hu, Ruizhen, and Xu, Kai
Subjects: Computer Science - Robotics, Computer Science - Graphics
Abstract: Reaching-and-grasping is a fundamental skill for robotic manipulation, but existing methods usually train models on a specific gripper and cannot be reused on another gripper without retraining. In this paper, we propose a novel method that can learn a unified policy model that can be easily transferred to different dexterous grippers. Our method consists of two stages: a gripper-agnostic policy model that predicts the displacements of predefined key points on the gripper, and a gripper-specific adaptation model that translates these displacements into adjustments for controlling the grippers' joints. The gripper state and interactions with objects are captured at the finger level using robust geometric representations, integrated with a transformer-based network to address variations in gripper morphology and geometry. In the experimental part, we evaluate our method on several dexterous grippers and objects of diverse shapes, and the results show that our method significantly outperforms the baseline methods. Pioneering the transfer of grasp policies across different dexterous grippers, our method effectively demonstrates its potential for learning generalizable and transferable manipulation skills for various robotic hands.
Published: 2024

10. Learning Instance-Aware Correspondences for Robust Multi-Instance Point Cloud Registration in Cluttered Scenes

Author: Yu, Zhiyuan, Qin, Zheng, Zheng, Lintao, and Xu, Kai
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Multi-instance point cloud registration estimates the poses of multiple instances of a model point cloud in a scene point cloud. Extracting accurate point correspondences is at the center of the problem. Existing approaches usually treat the scene point cloud as a whole, overlooking the separation of instances. Therefore, point features could be easily polluted by other points from the background or different instances, leading to inaccurate correspondences oblivious to separate instances, especially in cluttered scenes. In this work, we propose MIRETR, Multi-Instance REgistration TRansformer, a coarse-to-fine approach to the extraction of instance-aware correspondences. At the coarse level, it jointly learns instance-aware superpoint features and predicts per-instance masks. With instance masks, the influence from outside of the instance concerned is minimized, such that highly reliable superpoint correspondences can be extracted. The superpoint correspondences are then extended to instance candidates at the fine level according to the instance masks. Finally, an efficient candidate selection and refinement algorithm is devised to obtain the final registrations. Extensive experiments on three public benchmarks demonstrate the efficacy of our approach. In particular, MIRETR outperforms the state of the art by 16.6 points on F1 score on the challenging ROBI benchmark. Code and models are available at https://github.com/zhiyuanYU134/MIRETR.
Published: 2024

11. Black Hole Entropy for M-theory on the Quintic Threefold via F-theoretic Strings

Author: Halder, Indranil, Vafa, Cumrun, and Xu, Kai
Subjects: High Energy Physics - Theory, Mathematical Physics
Abstract: Microscopic black hole entropy calculations in string theory usually proceed through identifying them as wrapped strings in one higher dimension. For M-theory on elliptic Calabi-Yau threefolds this proceeds via its relation to F-theory in one higher dimension. Here we show how this method can be extended to M-theory on non-elliptic Calabi-Yau threefolds such as the quintic via conifold transition to elliptic threefolds. This leads to the computation of the black hole entropy through elliptic genera of the strings. However, the Cardy formula for the computation of the black hole entropy of these strings fails because the relevant momentum excitations on the string are much smaller than the central charge of the strings. We show how the black hole attractor entropy formula leads to predicting corrections to the Cardy formula in this regime.
Comment: 16 pages
Published: 2024

12. InterFusion: Text-Driven Generation of 3D Human-Object Interaction

Author: Dai, Sisi, Li, Wenhao, Sun, Haowen, Huang, Haibin, Ma, Chongyang, Huang, Hui, Xu, Kai, and Hu, Ruizhen
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this study, we tackle the complex task of generating 3D human-object interactions (HOI) from textual descriptions in a zero-shot text-to-3D manner. We identify and address two key challenges: the unsatisfactory outcomes of direct text-to-3D methods in HOI, largely due to the lack of paired text-interaction data, and the inherent difficulties in simultaneously generating multiple concepts with complex spatial relationships. To effectively address these issues, we present InterFusion, a two-stage framework specifically designed for HOI generation. InterFusion involves human pose estimations derived from text as geometric priors, which simplifies the text-to-3D conversion process and introduces additional constraints for accurate object generation. At the first stage, InterFusion extracts 3D human poses from a synthesized image dataset depicting a wide range of interactions, subsequently mapping these poses to interaction descriptions. The second stage of InterFusion capitalizes on the latest developments in text-to-3D generation, enabling the production of realistic and high-quality 3D HOI scenes. This is achieved through a local-global optimization process, where the generation of human body and object is optimized separately, and jointly refined with a global optimization of the entire scene, ensuring a seamless and contextually coherent integration. Our experimental results affirm that InterFusion significantly outperforms existing state-of-the-art methods in 3D HOI generation.
Published: 2024

13. Surface Reconstruction from Point Clouds via Grid-based Intersection Prediction

Author: Tian, Hui and Xu, Kai
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Surface reconstruction from point clouds is a crucial task in the fields of computer vision and computer graphics. SDF-based methods excel at reconstructing smooth meshes with minimal error and artefacts but struggle with representing open surfaces. On the other hand, UDF-based methods can effectively represent open surfaces but often introduce noise, leading to artefacts in the mesh. In this work, we propose a novel approach that directly predicts the intersection points between the line segments of point pairs and implicit surfaces. To achieve this, we propose two modules, named the Relative Intersection Module and the Sign Module, which take the features of a point pair as input. To preserve the continuity of the surface, we also integrate symmetry into the two modules, which means the position of the predicted intersection will not change even if the input order of the point pair changes. This method not only preserves the ability to represent open surfaces but also eliminates most artefacts on the mesh. Our approach demonstrates state-of-the-art performance on three datasets: ShapeNet, MGN, and ScanNet. The code will be made available upon acceptance.
Published: 2024

14. Exploring Hilbert-Space Fragmentation on a Superconducting Processor

Author: Wang, Yong-Yi, Shi, Yun-Hao, Sun, Zheng-Hang, Chen, Chi-Tong, Wang, Zheng-An, Zhao, Kui, Liu, Hao-Tian, Ma, Wei-Guo, Wang, Ziting, Li, Hao, Zhang, Jia-Chi, Liu, Yu, Deng, Cheng-Lin, Li, Tian-Ming, He, Yang, Liu, Zheng-He, Peng, Zhen-Yu, Song, Xiaohui, Xue, Guangming, Yu, Haifeng, Huang, Kaixuan, Xiang, Zhongcheng, Zheng, Dongning, Xu, Kai, and Fan, Heng
Subjects: Quantum Physics, Condensed Matter - Disordered Systems and Neural Networks, Condensed Matter - Statistical Mechanics
Abstract: Isolated interacting quantum systems generally thermalize, yet there are several counterexamples for the breakdown of ergodicity, such as many-body localization and quantum scars. Recently, ergodicity breaking has been observed in systems subjected to linear potentials, termed Stark many-body localization. This phenomenon is closely associated with Hilbert-space fragmentation, characterized by a strong dependence of dynamics on initial conditions. Here, we experimentally explore initial-state dependent dynamics using a ladder-type superconducting processor with up to 24 qubits, which enables precise control of the qubit frequency and initial state preparation. In systems with linear potentials, we observe distinct non-equilibrium dynamics for initial states with the same quantum numbers and energy, but with varying domain wall numbers. This distinction becomes increasingly pronounced as the system size grows, in contrast with disordered interacting systems. Our results provide convincing experimental evidence of the fragmentation in Stark systems, enriching our understanding of the weak breakdown of ergodicity.
Comment: main text: 7 pages, 4 figures; supplementary: 13 pages, 14 figures
Published: 2024

15. PrompTHis: Visualizing the Process and Influence of Prompt Editing during Text-to-Image Creation

Author: Guo, Yuhan, Shao, Hanning, Liu, Can, Xu, Kai, and Yuan, Xiaoru
Subjects: Computer Science - Human-Computer Interaction
Abstract: Generative text-to-image models, which allow users to create appealing images through a text prompt, have seen a dramatic increase in popularity in recent years. However, most users have a limited understanding of how such models work, and it often takes much trial and error to achieve satisfactory results. The prompt history contains a wealth of information that could provide users with insights into what has been explored and how prompt changes impact the output image, yet little research attention has been paid to the visual analysis of this process to support users. We propose the Image Variant Graph, a novel visual representation designed to support comparing prompt-image pairs and exploring the editing history. The Image Variant Graph models prompt differences as edges between corresponding images and presents the distances between images through projection. Based on the graph, we developed the PrompTHis system through co-design with artists. Besides the Image Variant Graph, PrompTHis also incorporates a detailed prompt-image history and a navigation mini-map. Based on the review and analysis of the prompting history, users can better understand the impact of prompt changes and have more effective control of image generation. A quantitative user study with eleven amateur participants and qualitative interviews with five professionals and one amateur user were conducted to evaluate the effectiveness of PrompTHis. The results demonstrate that PrompTHis can help users review the prompt history, make sense of the model, and plan their creative process.
Published: 2024

16. Synchronized Dual-arm Rearrangement via Cooperative mTSP

Author: Li, Wenhao, Zhang, Shishun, Dai, Sisi, Huang, Hui, Hu, Ruizhen, Chen, Xiaohong, and Xu, Kai
Subjects: Computer Science - Robotics
Abstract: Synchronized dual-arm rearrangement is widely studied as a common scenario in industrial applications. It often faces scalability challenges due to the computational complexity of robotic arm rearrangement and the high-dimensional nature of dual-arm planning. To address these challenges, we formulated the problem as cooperative mTSP, a variant of mTSP where agents share cooperative costs, and utilized reinforcement learning for its solution. Our approach involved representing rearrangement tasks using a task state graph that captured spatial relationships and a cooperative cost matrix that provided details about action costs. Taking these representations as observations, we designed an attention-based network to effectively combine them and provide rational task scheduling. Furthermore, a cost predictor is also introduced to directly evaluate actions during both training and planning, significantly expediting the planning process. Our experimental results demonstrate that our approach outperforms existing methods in terms of both performance and planning efficiency.
Published: 2024

17. LAB: Large-Scale Alignment for ChatBots

Author: Sudalairaj, Shivchander, Bhandwaldar, Abhishek, Pareja, Aldo, Xu, Kai, Cox, David D., and Srivastava, Akash
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: This work introduces LAB (Large-scale Alignment for chatBots), a novel methodology designed to overcome the scalability challenges in the instruction-tuning phase of large language model (LLM) training. Leveraging a taxonomy-guided synthetic data generation process and a multi-phase tuning framework, LAB significantly reduces reliance on expensive human annotations and proprietary models like GPT-4. We demonstrate that LAB-trained models can achieve competitive performance across several benchmarks compared to models trained with traditional human-annotated or GPT-4 generated synthetic data, thus offering a scalable, cost-effective solution for enhancing LLM capabilities and instruction-following behaviors without the drawbacks of catastrophic forgetting and marking a step forward in the efficient training of LLMs for a wide range of applications.
Comment: Corresponding Author: Akash Srivastava. Equal Contribution: Shivchander Sudalairaj, Abhishek Bhandwaldar, Aldo Pareja, Akash Srivastava. Code: https://github.com/instructlab
Published: 2024

18. Design and Practice of Digital Test and Verification of BeiDou Navigation Satellite System

Author: Wang, Wei, primary, Guo, Shuren, additional, Lu, Jun, additional, Gao, Weiguang, additional, Chai, Qiang, additional, Zhang, Gong, additional, Xu, Kai, additional, and Liu, Wenxiang, additional
Published: 2023

19. High-order topological pumping on a superconducting quantum processor

Author: Deng, Cheng-Lin, Liu, Yu, Zhang, Yu-Ran, Li, Xue-Gang, Liu, Tao, Chen, Chi-Tong, Liu, Tong, Lu, Cong-Wei, Wang, Yong-Yi, Li, Tian-Ming, Fang, Cai-Ping, Zhou, Si-Yun, Song, Jia-Cheng, Xu, Yue-Shan, He, Yang, Liu, Zheng-He, Huang, Kai-Xuan, Xiang, Zhong-Cheng, Wang, Jie-Ci, Zheng, Dong-Ning, Xue, Guang-Ming, Xu, Kai, Yu, H. F., and Fan, Heng
Subjects: Quantum Physics
Abstract: High-order topological phases of matter refer to systems with an $n$-dimensional bulk and topology of $m$-th order, which exhibit $(n-m)$-dimensional boundary modes and can be characterized by topological pumping. Here, we experimentally demonstrate two types of second-order topological pumps, forming four 0-dimensional corner localized states on a $4\times 4$ square lattice array of 16 superconducting qubits. The initial ground state of the system for half-filling, as a product of four identical entangled 4-qubit states, is prepared using an adiabatic scheme. During the pumping procedure, we adiabatically modulate the superlattice Bose-Hubbard Hamiltonian by precisely controlling both the hopping strengths and on-site potentials. At the half pumping period, the system evolves to a corner-localized state in a quadrupole configuration. The robustness of the second-order topological pump is also investigated by introducing different on-site disorder. Our work studies the topological properties of high-order topological phases from the dynamical transport picture using superconducting qubits, which would inspire further research on high-order topological phases.
Published: 2024

20. Reducing multivariate independence testing to two bivariate means comparisons

Author: Xu, Kai, Zhou, Yeqing, Zhu, Liping, and Li, Runze
Subjects: Statistics - Methodology
Abstract: Testing for independence between two random vectors is a fundamental problem in statistics. It is observed from empirical studies that many existing omnibus consistent tests may not work well for some strongly nonmonotonic and nonlinear relationships. To explore the reasons behind this issue, we transform the multivariate independence testing problem into the equivalent problem of checking the equality of two bivariate means. An important observation is that the power loss is mainly due to the cancellation of positive and negative terms in dependence metrics, making them very close to zero. Motivated by this observation, we propose a class of consistent metrics, indexed by a positive integer $\gamma$, that exactly characterize independence. Theoretically, we show that the metrics with even and infinite $\gamma$ can effectively avoid the cancellation and have high power under alternatives in which two mean differences offset each other. Since we target a wide range of dependence scenarios in practice, we further suggest combining the p-values of test statistics with different values of $\gamma$ through Fisher's method. We illustrate the advantages of our proposed tests through extensive numerical studies.
Published: 2024

21. Learning Dual-arm Object Rearrangement for Cartesian Robots

Author: Zhang, Shishun, She, Qijin, Li, Wenhao, Zhu, Chenyang, Wang, Yongjun, Hu, Ruizhen, and Xu, Kai
Subjects: Computer Science - Robotics, Computer Science - Machine Learning
Abstract: This work focuses on the dual-arm object rearrangement problem abstracted from a realistic industrial scenario of Cartesian robots. The goal of this problem is to transfer all the objects from sources to targets with the minimum total completion time. To achieve the goal, the core idea is to develop an effective object-to-arm task assignment strategy for minimizing the cumulative task execution time and maximizing the dual-arm cooperation efficiency. One of the difficulties in the task assignment is the scalability problem. As the number of objects increases, the computation time of traditional offline-search-based methods grows sharply because of their computational complexity. Encouraged by the adaptability of reinforcement learning (RL) in long-sequence task decisions, we propose an online task assignment decision method based on RL, and the computation time of our method only increases linearly with the number of objects. Further, we design an attention-based network to model the dependencies between the input states during the whole task execution process to help find the most reasonable object-to-arm correspondence in each task assignment round. In the experimental part, we adapt some search-based methods to this specific setting and compare our method with them. Experimental results show that our approach outperforms search-based methods in total execution time and computational efficiency, and also verify the generalization of our method to different numbers of objects. In addition, we show the effectiveness of our method deployed on the real robot in the supplementary video.
Comment: 7 pages, 9 figures, conference
Published: 2024

22. A Riemann-Hilbert approach to the two-component modified Camassa-Holm equation on the line

Author: Xu, Kai, Ju, Luman, and Fan, Engui
Subjects: Mathematical Physics, 35Q51, 35Q15, 37K15, 35C20
Abstract: In this paper, we develop a Riemann-Hilbert approach to the Cauchy problem for the two-component modified Camassa-Holm (2-mCH) equation based on its Lax pair. Further, via a series of deformations of the Riemann-Hilbert problem associated with the Cauchy problem, using the $\bar{\partial}$-generalization of the Deift-Zhou steepest descent method, we obtain the long-time asymptotic approximations of the solutions of the 2-mCH equation in four space-time regions. Our results also confirm the soliton resolution conjecture and the asymptotic stability of the $N$-soliton solutions for the 2-mCH equation.
Comment: 30 pages
Published: 2024

23. Dynamics of quantum coherence in many-body localized systems

Author: Chen, Jin-Jun, Xu, Kai, Ren, Li-Hang, Zhang, Yu-Ran, and Fan, Heng
Subjects: Quantum Physics, Condensed Matter - Mesoscale and Nanoscale Physics
Abstract: We demonstrate that the dynamics of quantum coherence serves as an effective probe for identifying dephasing, which is a distinctive signature of many-body localization (MBL). Quantum coherence can be utilized to measure both the local coherence of specific subsystems and the total coherence of the whole system in a consistent manner. Our results reveal that the local coherence of small subsystems decays over time following a power law in the MBL phase, while it reaches a stable value within the same time window in the Anderson localized (AL) phase. In contrast, the total coherence of the whole system exhibits logarithmic growth during the MBL phase and reaches a stable value in the AL phase. Notably, this dynamic characteristic of quantum coherence remains robust even with weak interactions and displays unbounded behavior in infinite systems. Our results provide insights into understanding many-body dephasing phenomena in MBL systems and propose a novel feasible method for identifying and characterizing MBL phases in experiments.
Published: 2024

24. Conversational Crowdsensing: A Parallel Intelligence Powered Novel Sensing Approach

Author: Zhu, Zhengqiu, Zhao, Yong, Chen, Bin, Qiu, Sihang, Xu, Kai, Yin, Quanjun, Huang, Jincai, Liu, Zhong, and Wang, Fei-Yue
Subjects: Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction
Abstract: The transition from CPS-based Industry 4.0 to CPSS-based Industry 5.0 brings new requirements and opportunities to current sensing approaches, especially in light of recent progress in Chatbots and Large Language Models (LLMs). Therefore, the advancement of parallel intelligence-powered Crowdsensing Intelligence (CSI) is witnessed, which is currently advancing towards linguistic intelligence. In this paper, we propose a novel sensing paradigm, namely conversational crowdsensing, for Industry 5.0. It can alleviate the workload and professional requirements of individuals and promote the organization and operation of a diverse workforce, thereby facilitating faster response and wider popularization of crowdsensing systems. Specifically, we design the architecture of conversational crowdsensing to effectively organize three types of participants (biological, robotic, and digital) from diverse communities. Through three levels of effective conversation (i.e., inter-human, human-AI, and inter-AI), complex interactions and service functionalities of different workers can be achieved to accomplish various tasks across three sensing phases (i.e., requesting, scheduling, and executing). Moreover, we explore the foundational technologies for realizing conversational crowdsensing, encompassing LLM-based multi-agent systems, scenarios engineering and conversational human-AI cooperation. Finally, we present potential industrial applications of conversational crowdsensing and discuss its implications. We envision that conversations in natural language will become the primary communication channel during the crowdsensing process, enabling richer information exchange and cooperative problem-solving among humans, robots, and AI.
Published: 2024

25. GliDe with a CaPE: A Low-Hassle Method to Accelerate Speculative Decoding

Author: Du, Cunxiao, Jiang, Jing, Yuanchen, Xu, Wu, Jiawei, Yu, Sicheng, Li, Yongqi, Li, Shenggui, Xu, Kai, Nie, Liqiang, Tu, Zhaopeng, and You, Yang
Subjects: Computer Science - Computation and Language
Abstract: Speculative decoding is a relatively new decoding framework that leverages small and efficient draft models to reduce the latency of LLMs. In this study, we introduce GliDe and CaPE, two low-hassle modifications to vanilla speculative decoding to further improve the decoding speed of a frozen LLM. Specifically, GliDe is a modified draft model architecture that reuses the cached keys and values from the target LLM, while CaPE is a proposal expansion method that uses the draft model's confidence scores to help select additional candidate tokens for verification. Extensive experiments on different benchmarks demonstrate that our proposed GliDe draft model significantly reduces the expected decoding latency. Additional evaluation using walltime reveals that GliDe can accelerate Vicuna models up to 2.17x and further extend the improvement to 2.61x with CaPE. We will release our code, data, and the trained draft models.
Published: 2024

26. Experimental genuine quantum nonlocality in the triangle network

Author: Wang, Ning-Ning, Zhang, Chao, Cao, Huan, Xu, Kai, Liu, Bi-Heng, Huang, Yun-Feng, Li, Chuan-Feng, Guo, Guang-Can, Gisin, Nicolas, Kriváchy, Tamás, and Renou, Marc-Olivier
Subjects: Quantum Physics
Abstract: In the last decade, it was understood that quantum networks involving several independent sources of entanglement which are distributed and measured by several parties allowed for completely novel forms of nonclassical quantum correlations, when entangled measurements are performed. Here, we experimentally obtain quantum correlations in a triangle network structure, and provide solid evidence of its nonlocality. Specifically, we first obtain the elegant distribution proposed in (Entropy 21, 325) by performing a six-photon experiment. Then, we justify its nonlocality based on machine learning tools to estimate the distance of the experimentally obtained correlation to the local set, and through the violation of a family of conjectured inequalities tailored for the triangle network.
Published: 2024

27. The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness

Author: Du, Mengyao, Zhang, Miao, Pu, Yuwen, Xu, Kai, Ji, Shouling, and Yin, Quanjun
Subjects: Computer Science - Machine Learning
Abstract: To tackle the scarcity and privacy issues associated with domain-specific datasets, the integration of federated learning in conjunction with fine-tuning has emerged as a practical solution. However, our findings reveal that federated learning has the risk of skewing fine-tuning features and compromising the out-of-distribution robustness of the model. By introducing three robustness indicators and conducting experiments across diverse robust datasets, we elucidate these phenomena by scrutinizing the diversity, transferability, and deviation within the model feature space. To mitigate the negative impact of federated learning on model robustness, we introduce GNP, a General Noisy Projection-based robust algorithm, ensuring no deterioration of accuracy on the target distribution. Specifically, the key strategy for enhancing model robustness entails the transfer of robustness from the pre-trained model to the fine-tuned model, coupled with adding a small amount of Gaussian noise to augment the representative capacity of the model. Comprehensive experimental results demonstrate that our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods and confronting different levels of data heterogeneity., Comment: 12 pages, 10 figures
Published: 2024

28. DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection

Author: Ye, Yunfan, Xu, Kai, Huang, Yuhang, Yi, Renjiao, and Cai, Zhiping
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Limited by the encoder-decoder architecture, learning-based edge detectors usually have difficulty predicting edge maps that satisfy both correctness and crispness. With the recent success of the diffusion probabilistic model (DPM), we found it is especially suitable for accurate and crisp edge detection since the denoising process is directly applied to the original image size. Therefore, we propose the first diffusion model for the task of general edge detection, which we call DiffusionEdge. To avoid expensive computational resources while retaining the final performance, we apply DPM in the latent space and enable the classic cross-entropy loss which is uncertainty-aware in pixel level to directly optimize the parameters in latent space in a distillation manner. We also adopt a decoupled architecture to speed up the denoising process and propose a corresponding adaptive Fourier filter to adjust the latent features of specific frequencies. With all the technical designs, DiffusionEdge can be stably trained with limited resources, predicting crisp and accurate edge maps with much fewer augmentation strategies. Extensive experiments on four edge detection benchmarks demonstrate the superiority of DiffusionEdge both in correctness and crispness. On the NYUDv2 dataset, compared to the second best, we increase the ODS, OIS (without post-processing) and AC by 30.2%, 28.1% and 65.1%, respectively. Code: https://github.com/GuHuangAI/DiffusionEdge., Comment: AAAI 2024
Published: 2024

29. Disorder-induced topological pumping on a superconducting quantum processor

Author: Liu, Yu, Zhang, Yu-Ran, Shi, Yun-Hao, Liu, Tao, Lu, Congwei, Wang, Yong-Yi, Li, Hao, Li, Tian-Ming, Deng, Cheng-Lin, Zhou, Si-Yun, Liu, Tong, Zhang, Jia-Chi, Liang, Gui-Han, Mei, Zheng-Yang, Ma, Wei-Guo, Liu, Hao-Tian, Liu, Zheng-He, Chen, Chi-Tong, Huang, Kaixuan, Song, Xiaohui, Zhao, SP, Tian, Ye, Xiang, Zhongcheng, Zheng, Dongning, Nori, Franco, Xu, Kai, and Fan, Heng
Subjects: Quantum Physics
Abstract: Thouless pumping, a dynamical version of the integer quantum Hall effect, represents the quantized charge pumped during an adiabatic cyclic evolution. Here we report experimental observations of nontrivial topological pumping that is induced by disorder even during a topologically trivial pumping trajectory. With a 41-qubit superconducting quantum processor, we develop a Floquet engineering technique to realize cycles of adiabatic pumping by simultaneously varying the on-site potentials and the hopping couplings. We demonstrate Thouless pumping in the presence of disorder and show its breakdown as the strength of disorder increases. Moreover, we observe two types of topological pumping that are induced by on-site potential disorder and hopping disorder, respectively. Especially, an intrinsic topological pump that is induced by quasi-periodic hopping disorder has never been experimentally realized before. Our highly controllable system provides a valuable quantum simulating platform for studying various aspects of topological physics in the presence of disorder.
Published: 2024

30. Recovery of damaged information via scrambling in indefinite causal order

Author: Jin, Tian-Ren, Li, Tian-Ming, Wang, Zheng-An, Xu, Kai, Zhang, Yu-Ran, and Fan, Heng
Subjects: Quantum Physics
Abstract: Scrambling prevents access to local information with local operators and therefore can be used to protect quantum information from damage caused by local perturbations. Even though partial quantum information can be recovered if the type of the damage is known, the initial target state cannot be completely recovered, because the obtained state is a mixture of the initial state and a maximally mixed state. Here, we demonstrate an improved scheme to recover damaged quantum information via scrambling in indefinite causal order. We can record the type of damage and improve the fidelity of the recovered quantum state with respect to the original one. Moreover, by iterating the schemes, the initial quantum state can be completely retrieved. In addition, we experimentally demonstrate our schemes on the cloud-based quantum computer named Quafu. Our work proposes a feasible scheme to protect whole quantum information from damage, which is also compatible with other techniques such as quantum error correction and entanglement purification protocols.
Published: 2023

31. Latent Space Explorer: Visual Analytics for Multimodal Latent Space Exploration

Author: Kwon, Bum Chul, Friedman, Samuel, Xu, Kai, Lubitz, Steven A, Philippakis, Anthony, Batra, Puneet, Ellinor, Patrick T, and Ng, Kenney
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Human-Computer Interaction, Electrical Engineering and Systems Science - Signal Processing
Abstract: Machine learning models built on training data with multiple modalities can reveal new insights that are not accessible through unimodal datasets. For example, cardiac magnetic resonance images (MRIs) and electrocardiograms (ECGs) are both known to capture useful information about subjects' cardiovascular health status. A multimodal machine learning model trained from large datasets can potentially predict the onset of heart-related diseases and provide novel medical insights about the cardiovascular system. Despite the potential benefits, it is difficult for medical experts to explore multimodal representation models without visual aids and to test the predictive performance of the models on various subpopulations. To address the challenges, we developed a visual analytics system called Latent Space Explorer. Latent Space Explorer provides interactive visualizations that enable users to explore the multimodal representation of subjects, define subgroups of interest, interactively decode data with different modalities with the selected subjects, and inspect the accuracy of the embedding in downstream prediction tasks. A user study was conducted with medical experts and their feedback provided useful insights into how Latent Space Explorer can help their analysis and possible new direction for further development in the medical domain., Comment: 7 pages, 5 figures
Published: 2023

32. Scalar curvature and volume entropy of hyperbolic 3-manifolds

Author: Kazaras, Demetre, Song, Antoine, and Xu, Kai
Subjects: Mathematics - Differential Geometry, Mathematics - Geometric Topology, 53C20, 57K32
Abstract: We show that any closed hyperbolic 3-manifold M admits a Riemannian metric with scalar curvature at least -6, but with volume entropy strictly larger than 2. In particular, this construction gives counterexamples to a conjecture of I. Agol, P. Storm and W. Thurston., Comment: 21 pages, 2 figures. Comments welcome!
Published: 2023

33. DAP: Domain-aware Prompt Learning for Vision-and-Language Navigation

Author: Liu, Ting, Hu, Yue, Wu, Wansen, Wang, Youkai, Xu, Kai, and Yin, Quanjun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Following language instructions to navigate in unseen environments is a challenging task for autonomous embodied agents. With strong representation capabilities, pretrained vision-and-language models are widely used in VLN. However, most of them are trained on web-crawled general-purpose datasets, which incurs a considerable domain gap when used for VLN tasks. To address the problem, we propose a novel and model-agnostic domain-aware prompt learning (DAP) framework. For equipping the pretrained models with specific object-level and scene-level cross-modal alignment in VLN tasks, DAP applies a low-cost prompt tuning paradigm to learn soft visual prompts for extracting in-domain image semantics. Specifically, we first generate a set of in-domain image-text pairs with the help of the CLIP model. Then we introduce soft visual prompts in the input space of the visual encoder in a pretrained model. DAP injects in-domain visual knowledge into the visual encoder of the pretrained model in an efficient way. Experimental results on both R2R and REVERIE show the superiority of DAP compared to existing state-of-the-art methods., Comment: 4 pages. arXiv admin note: substantial text overlap with arXiv:2309.03661
Published: 2023

34. Demonstration of Maxwell Demon-assistant Einstein-Podolsky-Rosen Steering via Superconducting Quantum Processor

Author: Wang, Z. T., Wang, Ruixia, Zhao, Peng, Yang, Z. H., Huang, Kaixuan, Xu, Kai, Zhang, Yong-Sheng, Fan, Heng, Zhao, S. P., Hu, Meng-Jun, and Yu, Haifeng
Subjects: Quantum Physics
Abstract: The concept of Maxwell demon plays an essential role in connecting thermodynamics and information theory, while entanglement and non-locality are fundamental features of quantum theory. Given the rapid advancements in the field of quantum information science, there is a growing interest and significance in investigating the connection between Maxwell demon and quantum correlation. The majority of research endeavors thus far have been directed towards the extraction of work from quantum correlation through the utilization of Maxwell demon. Recently, a novel concept called Maxwell demon-assistant Einstein-Podolsky-Rosen (EPR) steering has been proposed, which suggests that it is possible to simulate quantum correlation by doing work. This seemingly counterintuitive conclusion is attributed to the fact that Alice and Bob need classical communication during EPR steering task, a requirement that does not apply in the Bell test. In this study, we demonstrate Maxwell demon-assistant EPR steering with superconducting quantum circuits. By compiling and optimizing a quantum circuit to be implemented on a 2D superconducting chip, we were able to achieve a steering parameter of $S_{2} = 0.770 \pm 0.005$ in the case of two measurement settings, which surpasses the classical bound of $1/\sqrt{2}$ by 12.6 standard deviations. In addition, experimental observations have revealed a linear correlation between the non-locality demonstrated in EPR steering and the work done by the demon. Considering the errors in practical operation, the experimental results are highly consistent with theoretical predictions. Our findings not only suggest the presence of a Maxwell demon loophole in the EPR steering, but also contribute to a deeper comprehension of the interplay between quantum correlation, information theory, and thermodynamics., Comment: Comments are welcome!
Published: 2023
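
The significance quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is not from the paper: the helper name `sigma_excess` is hypothetical, and it assumes the quoted uncertainty of 0.005 is one standard deviation.

```python
import math


def sigma_excess(value: float, bound: float, sigma: float) -> float:
    """Return how many standard deviations `value` lies above `bound`."""
    return (value - bound) / sigma


# Numbers quoted in the abstract; treating 0.005 as one standard deviation
# is an assumption of this sketch.
s2 = 0.770
s2_sigma = 0.005
classical_bound = 1 / math.sqrt(2)  # about 0.7071

n_sigma = sigma_excess(s2, classical_bound, s2_sigma)
print(f"violation: {n_sigma:.1f} standard deviations")
```

The result agrees with the 12.6 standard deviations stated in the abstract.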

35. Experimental verification of the steering ellipsoid zoo via two-qubit states

Author: Xu, Kai, Liu, Lijun, Wang, Ning-Ning, Zhang, Chao, Huang, Yun-Feng, Liu, Bi-Heng, Cheng, Shuming, Li, Chuan-Feng, and Guo, Guang-Can
Subjects: Quantum Physics
Abstract: The quantum steering ellipsoid visualizes the set of all qubit states that can be steered by measuring on another correlated qubit in the Bloch picture. Together with local reduced states, it provides a faithful geometric characterization of the underlying two-qubit state so that almost all nonclassical state features can be reflected in its geometric properties. Consequently, the various types of quantum ellipsoids with different geometric properties form an ellipsoid zoo, which, in this work, is experimentally verified via measurements on many polarization-path photonic states. By generating two-qubit states with high fidelity, the corresponding ellipsoids are constructed to certify the presence of entanglement, one-way Einstein-Podolsky-Rosen steering, discord, and steering incompleteness. It is also experimentally verified that the steering ellipsoid can be reconstructed using the twelve vertices of the icosahedron as measurement directions. Our results aid progress in applying the quantum steering ellipsoid to reveal nonclassical features of the multi-qubit system., Comment: Several references are added and the presentation is close to the publication version
Published: 2023
Full Text: View/download PDF

36. Polyhedral Surface: Self-supervised Point Cloud Reconstruction Based on Polyhedral Surface

Author: Tian, Hui and Xu, Kai
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Surface reconstruction from raw point clouds has been an important topic in computer graphics for decades, especially due to its high demand in modeling and rendering applications. An important way to solve this problem is to establish a local geometry that fits the local surface. However, previous methods build either a local plane or a polynomial surface. A local plane loses sharp features and introduces boundary artefacts on open surfaces. A polynomial surface is hard to combine with neural networks because of the local coordinate consistency problem. To address this, we propose a novel polyhedral surface to represent the local surface. This method provides more flexibility in representing sharp features and surface boundaries on open surfaces. It does not require any local coordinate system, which is important when introducing neural networks. Specifically, we use normals to construct the polyhedral surface, including both dihedral and trihedral surfaces using 2 and 3 normals, respectively. Our method achieves state-of-the-art results on three commonly used datasets (ShapeNetCore, ABC, and ScanNet). Code will be released upon acceptance.
Published: 2023

37. $P_c$ states in the mixture of molecular and pentaquark pictures

Author: Xu, Kai, Phumphan, Kanokphon, Ruangyoo, Wiriya, Chen, Chia-Chu, Limphirat, Ayut, and Yan, Yupeng
Subjects: High Energy Physics - Phenomenology
Abstract: We systematically study hidden charm pentaquark states in the constituent quark model with a general Hamiltonian for multiquark systems, considering the coupling between the $\Sigma_c^{(*)}\bar{D}^{(*)}$ molecular states and the $q^3c\bar c$ compact pentaquark states by the one-gluon exchange hyperfine interaction. The ground state hidden-charm pentaquark mass spectra and the strong decay widths are calculated. This work suggests that $P_c(4312)$, $P_c(4457)$ and $P_c(4380)$ resonances might be mainly $\Sigma_c \bar D$, $\Sigma_c \bar D^*$ and $\Sigma_c^* \bar D$ hadronic molecules respectively, and $P_c(4440)$ might include sizable pentaquark components.
Published: 2023

38. Zero-noise Extrapolation Assisted with Purity for Quantum Error Mitigation

Author: Jin, Tian-Ren, Shi, Yun-Hao, Wang, Zheng-An, Li, Tian-Ming, Xu, Kai, and Fan, Heng
Subjects: Quantum Physics
Abstract: Quantum error mitigation is a technique used to post-process errors occurring in the quantum system, which reduces the expected errors and achieves higher accuracy. One method of quantum error mitigation is zero-noise extrapolation, which involves amplifying the noise and then extrapolating the observable expectation of interest back to a noise-free point. This method usually relies on the error model of the noise, as error rates for different levels of noise are assumed during the noise amplification process. In this paper, we propose that the purity of output states in noisy circuits can assist in the extrapolation process, eliminating the need for assumptions about error rates. We also introduce the quasi-polynomial model from the linearity of quantum channel for extrapolation of experimental data, which can be reduced to other proposed models. Furthermore, we verify our purity-assisted zero-noise extrapolation by performing numerical simulations and experiments on the online public quantum computation platform, Quafu, to compare it with the routine zero-noise extrapolation and virtual distillation methods. Our results demonstrate that this modified method can suppress the random fluctuation of operator expectation measurement, and effectively reduces the bias in extrapolation to a level lower than both the zero-noise extrapolation and virtual distillation methods, especially when the error rate is moderate.
Published: 2023
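
For readers unfamiliar with the routine zero-noise extrapolation that the abstract above takes as its baseline, the sketch below shows the basic idea: measure an expectation value at several amplified noise levels, then extrapolate to zero noise. It uses plain Richardson (Lagrange polynomial) extrapolation, not the purity-assisted method of the paper. The function name and the synthetic data are assumptions made for illustration.

```python
from typing import Sequence


def zero_noise_extrapolate(scales: Sequence[float], values: Sequence[float]) -> float:
    """Richardson extrapolation of measured expectation values to zero noise.

    `scales` are the noise amplification factors (distinct and nonzero) and
    `values` the expectation values measured at each of them. The polynomial
    passing through all points is evaluated at scale 0.
    """
    if len(scales) != len(values) or len(set(scales)) != len(scales):
        raise ValueError("need one value per distinct noise scale")
    estimate = 0.0
    for i, (s_i, v_i) in enumerate(zip(scales, values)):
        # Lagrange basis polynomial for node s_i, evaluated at 0.
        weight = 1.0
        for j, s_j in enumerate(scales):
            if j != i:
                weight *= s_j / (s_j - s_i)
        estimate += weight * v_i
    return estimate


# Synthetic data (hypothetical): E(s) = 1 - 0.1*s + 0.01*s**2, so the
# noise-free value is E(0) = 1.
scales = [1.0, 2.0, 3.0]
values = [1 - 0.1 * s + 0.01 * s**2 for s in scales]
print(zero_noise_extrapolate(scales, values))
```

With three noise scales the fit is quadratic, so the synthetic quadratic above is recovered up to floating-point rounding. On real data the extrapolation also amplifies statistical fluctuations, which is one of the issues the abstract says the purity-assisted variant aims to suppress.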

39. Probing spin hydrodynamics on a superconducting quantum simulator

Author: Shi, Yun-Hao, Sun, Zheng-Hang, Wang, Yong-Yi, Wang, Zheng-An, Zhang, Yu-Ran, Ma, Wei-Guo, Liu, Hao-Tian, Zhao, Kui, Song, Jia-Cheng, Liang, Gui-Han, Mei, Zheng-Yang, Zhang, Jia-Chi, Li, Hao, Chen, Chi-Tong, Song, Xiaohui, Wang, Jieci, Xue, Guangming, Yu, Haifeng, Huang, Kaixuan, Xiang, Zhongcheng, Xu, Kai, Zheng, Dongning, and Fan, Heng
Subjects: Quantum Physics
Abstract: Characterizing the nature of hydrodynamical transport properties in quantum dynamics provides valuable insights into the fundamental understanding of exotic non-equilibrium phases of matter. Simulating infinite-temperature transport on large-scale complex quantum systems remains an outstanding challenge. Here, using a controllable and coherent superconducting quantum simulator, we experimentally realize the analog quantum circuit, which can efficiently prepare the Haar-random states, and probe spin transport at infinite temperature. We observe diffusive spin transport during the unitary evolution of the ladder-type quantum simulator with ergodic dynamics. Moreover, we explore the transport properties of the systems subjected to strong disorder or a tilted potential, revealing signatures of anomalous subdiffusion accompanied by the breakdown of thermalization. Our work demonstrates a scalable method of probing infinite-temperature spin transport on analog quantum simulators, which paves the way to study other intriguing out-of-equilibrium phenomena from the perspective of transport., Comment: Main text: 12 pages, 7 figures; Supplementary: 12 pages, 8 figures
Published: 2023

40. Imitator Learning: Achieve Out-of-the-Box Imitation Ability in Variable Environments

Author: Chen, Xiong-Hui, Ye, Junyin, Zhao, Hang, Li, Yi-Chen, Shi, Haoran, Xu, Yu-Yan, Ye, Zhihao, Yang, Si-Hang, Huang, Anqi, Xu, Kai, Zhang, Zongzhang, and Yu, Yang
Subjects: Computer Science - Machine Learning
Abstract: Imitation learning (IL) enables agents to mimic expert behaviors. Most previous IL techniques focus on precisely imitating one policy through mass demonstrations. However, in many applications, what humans require is the ability to perform various tasks directly through a few demonstrations of corresponding tasks, where the agent would meet many unexpected changes when deployed. In this scenario, the agent is expected to not only imitate the demonstration but also adapt to unforeseen environmental changes. This motivates us to propose a new topic called imitator learning (ItorL), which aims to derive an imitator module that can on-the-fly reconstruct the imitation policies based on very limited expert demonstrations for different unseen tasks, without any extra adjustment. In this work, we focus on imitator learning based on only one expert demonstration. To solve ItorL, we propose Demo-Attention Actor-Critic (DAAC), which integrates IL into a reinforcement-learning paradigm that can regularize policies' behaviors in unexpected situations. Besides, for autonomous imitation policy building, we design a demonstration-based attention architecture for imitator policy that can effectively output imitated actions by adaptively tracing the suitable states in demonstrations. We develop a new navigation benchmark and a robot environment for ItorL and show that DAAC outperforms previous imitation methods by large margins on both seen and unseen tasks.
Published: 2023

41. Scaling for Training Time and Post-hoc Out-of-distribution Detection Enhancement

Author: Xu, Kai, Chen, Rongyu, Franchi, Gianni, and Yao, Angela
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: The capacity of a modern deep learning system to determine if a sample falls within its realm of knowledge is fundamental and important. In this paper, we offer insights and analyses of a recent state-of-the-art out-of-distribution (OOD) detection method, extremely simple activation shaping (ASH). We demonstrate that activation pruning has a detrimental effect on OOD detection, while activation scaling enhances it. Moreover, we propose SCALE, a simple yet effective post-hoc network enhancement method for OOD detection, which attains state-of-the-art OOD detection performance without compromising in-distribution (ID) accuracy. By integrating scaling concepts into the training process to capture a sample's ID characteristics, we propose Intermediate Tensor SHaping (ISH), a lightweight method for training time OOD detection enhancement. We achieve AUROC scores of +1.85% for near-OOD and +0.74% for far-OOD datasets on the OpenOOD v1.5 ImageNet-1K benchmark. Our code and models are available at https://github.com/kai422/SCALE.
Published: 2023

42. Accurate and Fast Compressed Video Captioning

Author: Shen, Yaojie, Gu, Xin, Xu, Kai, Fan, Heng, Wen, Longyin, and Zhang, Libo
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Existing video captioning approaches typically first sample video frames from a decoded video and then conduct a subsequent process (e.g., feature extraction and/or captioning model learning). In this pipeline, manual frame sampling may ignore key information in videos and thus degrade performance. Additionally, redundant information in the sampled frames may result in low efficiency in the inference of video captioning. To address this, we study video captioning from a different perspective in the compressed domain, which brings multi-fold advantages over the existing pipeline: 1) Compared to raw images from the decoded video, the compressed video, consisting of I-frames, motion vectors and residuals, is highly distinguishable, which allows us to leverage the entire video for learning without manual sampling through a specialized model design; 2) The captioning model is more efficient in inference as smaller and less redundant information is processed. We propose a simple yet effective end-to-end transformer in the compressed domain for video captioning that enables learning from the compressed video for captioning. We show that even with a simple design, our method can achieve state-of-the-art performance on different benchmarks while running almost 2x faster than existing approaches. Code is available at https://github.com/acherstyx/CoCap., Comment: ICCV 2023
Published: 2023

43. Drawstrings and flexibility in the Geroch conjecture

Author: Kazaras, Demetre and Xu, Kai
Subjects: Mathematics - Differential Geometry, 53C21, 53C23
Abstract: In this paper, we observe new phenomena related to the structure of 3-manifolds satisfying lower scalar curvature bounds. We construct warped-product manifolds of almost nonnegative scalar curvature that converge to pulled string spaces in the Sormani-Wenger intrinsic flat topology. These examples extend the results of Lee-Naber-Neumayer to the case of dimension $3$. As a consequence, we produce the first counterexample to a conjecture of Sormani on the stability of the Geroch Conjecture. Our example tests the appropriate hypothesis for a related conjecture of Gromov. On the other hand, we demonstrate a $W^{1,p}$-stability statement ($1\leq p<2$) for the Geroch Conjecture in the class of warped products., Comment: A second proof of Theorem 3.1 was added. The exposition of the introduction was improved. 33 pages, 4 figures, comments welcome
Published: 2023

44. Prompt-based Context- and Domain-aware Pretraining for Vision and Language Navigation

Author: Liu, Ting, Hu, Yue, Wu, Wansen, Wang, Youkai, Xu, Kai, and Yin, Quanjun
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Pretrained visual-language models have extensive world knowledge and are widely used in visual and language navigation (VLN). However, they are not sensitive to indoor scenarios for VLN tasks. Another challenge for VLN is how the agent understands the contextual relations between actions on a path and performs cross-modal alignment sequentially. In this paper, we propose a novel Prompt-bAsed coNtext- and inDoor-Aware (PANDA) pretraining framework to address these problems. It performs prompting in two stages. In the indoor-aware stage, we apply an efficient tuning paradigm to learn deep visual prompts from an indoor dataset, in order to augment pretrained models with inductive biases towards indoor environments. This can enable more sample-efficient adaptation for VLN agents. Furthermore, in the context-aware stage, we design a set of hard context prompts to capture the sequence-level semantics in the instruction. They enable further tuning of the pretrained models via contrastive learning. Experimental results on both R2R and REVERIE show the superiority of PANDA compared to existing state-of-the-art methods., Comment: 12 pages
Published: 2023

45. Hybrid Massive-MIMO and Its Practical Beamforming Implementation

Author: Xu, Kai, primary, Hou, Jiayu, additional, and Ding, Yuan, additional
Published: 2023
Full Text: View/download PDF

46. Mechanical properties of basalt fiber-sisal fiber/fly ash regenerative concrete

Author: Jiang, Zhiwei, primary, Li, Zhuo, additional, and Xu, Kai, additional
Published: 2023
Full Text: View/download PDF

47. Stage-by-stage Wavelet Optimization Refinement Diffusion Model for Sparse-View CT Reconstruction

Author: Xu, Kai, Lu, Shiyu, Huang, Bin, Wu, Weiwen, and Liu, Qiegen
Subjects: Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Computer Vision and Pattern Recognition
Abstract: Diffusion models have emerged as potential tools to tackle the challenge of sparse-view CT reconstruction, displaying superior performance compared to conventional methods. Nevertheless, these prevailing diffusion models predominantly focus on the sinogram or image domains, which can lead to instability during model training, potentially culminating in convergence towards local minima. The wavelet transform serves to disentangle image contents and features into distinct frequency-component bands at varying scales, adeptly capturing diverse directional structures. Employing the wavelet transform as a guiding sparsity prior significantly enhances the robustness of diffusion models. In this study, we present an innovative approach named the Stage-by-stage Wavelet Optimization Refinement Diffusion (SWORD) model for sparse-view CT reconstruction. Specifically, we establish a unified mathematical model integrating low-frequency and high-frequency generative models, and obtain the solution with an optimization procedure. Furthermore, we apply the low-frequency and high-frequency generative models to the wavelet-decomposed components rather than the sinogram or image domains, ensuring the stability of model training. Our method is rooted in established optimization theory and comprises three distinct stages: low-frequency generation, high-frequency refinement and domain transform. Our experimental results demonstrate that the proposed method outperforms existing state-of-the-art methods both quantitatively and qualitatively.
Published: 2023

48. 'Zero change' platform for monolithic back-end-of-line integration of phase change materials in silicon photonics

Author: Wei, Maoliang, Xu, Kai, Tang, Bo, Li, Junying, Yun, Yiting, Zhang, Peng, Wu, Yingchun, Bao, Kangjian, Lei, Kunhao, Chen, Zequn, Ma, Hui, Sun, Chunlei, Liu, Ruonan, Li, Ming, Li, Lan, and Lin, Hongtao
Subjects: Physics - Optics, Physics - Applied Physics
Abstract: Monolithic integration of novel materials for unprecedented device functions without modifying the existing photonic component library is the key to advancing heterogeneous silicon photonic integrated circuits. To achieve this, the introduction of a silicon nitride etching stop layer at selective areas, coupled with a low-loss oxide trench to the waveguide surface, enables the incorporation of various functional materials without disrupting the reliability of foundry-verified devices. As an illustration, two distinct chalcogenide phase change materials (PCM) with remarkable nonvolatile modulation capabilities, namely Sb2Se3 and Ge2Sb2Se4Te1, were monolithically back-end-of-line integrated into silicon photonics. The PCM enables compact phase and intensity tuning units with zero-static power consumption. Taking advantage of these building blocks, the phase error of a push-pull Mach-Zehnder interferometer optical switch could be trimmed by a nonvolatile phase shifter with a 48% peak power consumption reduction. Micro-ring filters with a rejection ratio >25dB could be applied for >5-bit wavelength selective intensity modulation, and waveguide-based >7-bit intensity-modulation photonic attenuators could achieve >39dB broadband attenuation. The advanced "Zero change" back-end-of-line integration platform could not only facilitate the integration of PCMs for integrated reconfigurable photonics but also open up the possibilities for integrating other excellent optoelectronic materials in the future silicon photonic process design kits., Comment: 20 pages, 4 figures, submitted to Nature Photonics
Published: 2023

49. SuperUDF: Self-supervised UDF Estimation for Surface Reconstruction

Author: Tian, Hui, Zhu, Chenyang, Shi, Yifei, and Xu, Kai
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Learning-based surface reconstruction based on unsigned distance functions (UDF) has many advantages, such as handling open surfaces. We propose SuperUDF, a self-supervised UDF learning method which exploits a learned geometry prior for efficient training and a novel regularization for robustness to sparse sampling. The core idea of SuperUDF draws inspiration from the classical surface approximation operator of locally optimal projection (LOP). The key insight is that if the UDF is estimated correctly, the 3D points should be locally projected onto the underlying surface following the gradient of the UDF. Based on that, a number of inductive biases on UDF geometry and a pre-learned geometry prior are devised to learn UDF estimation efficiently. A novel regularization loss is proposed to make SuperUDF robust to sparse sampling. Furthermore, we also contribute a learning-based mesh extraction from the estimated UDFs. Extensive evaluations demonstrate that SuperUDF outperforms the state of the art on several public datasets in terms of both quality and efficiency. Code is available at https://github.com/THHHomas/SuperUDF.
Published: 2023

50. Observation of multiple steady states with engineered dissipation

Author: Li, Li, Liu, Tong, Guo, Xue-Yi, Zhang, He, Zhao, Silu, Xiang, Zhongcheng, Song, Xiaohui, Zhang, Yu-Xiang, Xu, Kai, Fan, Heng, and Zheng, Dongning
Subjects: Quantum Physics
Abstract: Simulating the dynamics of open quantum systems is essential in achieving practical quantum computation and understanding novel nonequilibrium behaviors. However, quantum simulation of a many-body system coupled to an engineered reservoir has yet to be fully explored in present-day experimental platforms. In this work, we introduce engineered noise into a one-dimensional ten-qubit superconducting quantum processor to emulate a generic many-body open quantum system. Our approach originates from the stochastic unravellings of the master equation. By measuring the end-to-end correlation, we identify multiple steady states stemming from a strong symmetry, which is established on the modified Hamiltonian via Floquet engineering. Furthermore, we find that the information stored in the initial state is maintained in the steady state driven by continuous dissipation on a five-qubit chain. Our work provides a manageable and hardware-efficient strategy for open-system quantum simulation.
Published: 2023
